WIP: apply upstream commit for "optimizer config param" #16

Closed
wants to merge 3 commits into from

@appletreeisyellow commented Apr 30, 2024

⚠️ This will not be merged. ⚠️

  1. This temporary branch is to help bring @NGA-TRAN's upstream PR feat: add optimizer config param to avoid grouping partitions prefer_existing_union apache/datafusion#10259 into IOx early

  2. This PR is based on the April 23, 2024 commit apache@65ecfda

git co -b chunchun/update-df-apr-week-4-2 65ecfda84f7a105412bbf4040885a1b1774668d1
  3. Applied the following patch(es):

    1. Cherry picked Allow adding user defined metadata to ParquetSink apache/datafusion#10224 / apache@9c8873a

      commit 9c8873af12826e47f5743991859790df7a3b6400
      Author: wiedld <wiedld@users.noreply.github.com>
      Date:   Fri Apr 26 03:42:16 2024 -0700
      
          Allow adding user defined metadata to `ParquetSink` (#10224)
    2. cherry picked fix: no longer support the substring function apache/datafusion#10242 / f8c623f

      commit f8c623fe045d70a87eac8dc8620b74ff73be56d5
      Author: Jonah Gao <jonahgao@msn.com>
      Date:   Sat Apr 27 02:30:09 2024 +0800
    3. cherry picked feat: add optimizer config param to avoid grouping partitions prefer_existing_union apache/datafusion#10259 / apache@2231183

      commit 22311835bc1b4bd83b50e1c3875b0e725622b872
      Author: Nga Tran <nga-tran@live.com>
      Date:   Tue Apr 30 11:45:34 2024 -0400
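
The steps above (branch from the base commit, then cherry-pick the three upstream commits in order) can be sketched as a small script. This is only an illustration under stated assumptions, not the commands actually run for this PR: the `run` and `apply_patches` helpers and the `DRY_RUN` variable are hypothetical names, `git checkout -b` is assumed to be what the `git co -b` alias expands to, and the script assumes it is run inside a clone that already has the listed commits fetched. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch only: recreate the temporary branch from the base commit and
# cherry-pick the three upstream commits listed in the description.
# run, apply_patches and DRY_RUN are hypothetical names for illustration.

BASE=65ecfda84f7a105412bbf4040885a1b1774668d1
BRANCH=chunchun/update-df-apr-week-4-2
# Commit hashes in the order they were applied (#10224, #10242, #10259).
PATCHES="9c8873af12826e47f5743991859790df7a3b6400
f8c623fe045d70a87eac8dc8620b74ff73be56d5
22311835bc1b4bd83b50e1c3875b0e725622b872"

# Print the command when DRY_RUN=1 (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Create the branch at BASE, then apply each patch; stop on the first failure.
apply_patches() {
    run git checkout -b "$BRANCH" "$BASE" || return 1
    for sha in $PATCHES; do
        run git cherry-pick "$sha" || return 1
    done
}

apply_patches
```

Setting `DRY_RUN=0` before running would execute the commands instead of printing them; a conflicting cherry-pick stops the loop so it can be resolved by hand.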

wiedld and others added 2 commits April 30, 2024 10:06
* chore: make explicit what ParquetWriterOptions are created from a subset of TableParquetOptions

* refactor: restore the ability to add kv metadata into the generated file sink

* test: demonstrate API contract for metadata TableParquetOptions

* chore: update code docs

* fix: parse on proper delimiter, and improve tests

* fix: enable any character in the metadata string value, by having any key parsing be a part of the format.metadata::key
* fix: no longer support the `substring` function

* enable from-for format

* update test comment

* review feedback

* review feedback

Co-authored-by: Jeffrey Vo <jeffrey.vo.australia@gmail.com>

---------

Co-authored-by: Jeffrey Vo <jeffrey.vo.australia@gmail.com>
…_existing_union` (apache#10259)

* feat: add a config param to avoid converting union to interleave

* chore: update config for the tests

* chore: update configs.md
@appletreeisyellow changed the title from "WIP(iox-10578): patched df upgrade 202-04-TBD" to 'WIP: apply upstream commit for "optimizer config param"' on Apr 30, 2024
@NGA-TRAN

Thanks @appletreeisyellow
Sorry for my ignorance. What is the process here? Should I or someone else approve this PR?

@appletreeisyellow

@NGA-TRAN No need to approve this PR. This PR is just a temporary branch to bring in upstream patches.

I opened a PR in influxdb_iox that actually upgrades DF to a version that includes your work: https://github.com/influxdata/influxdb_iox/pull/10834 <-- this is what you need to approve

@appletreeisyellow

The upgrade is done. Closing.

@appletreeisyellow deleted the chunchun/update-df-apr-week-4-2 branch May 1, 2024 20:39